-
Notifications
You must be signed in to change notification settings - Fork 856
Keccak implementations discarting benches #948
Conversation
|
I do agree with that. But this type of benchmark just compiled and was run for the first time today thanks to some changes to the zkevm-chain params. So I think for selection purposes, with the SuperCircuit should be enough.
Should we file an issue for that and work on it? If you want, you can take it directly @Brechtpd !
Well, here you can do various things. You can try to fetch an ENV var and if it doesn't give any value, default to something. Would that work? JIC, I'll run the benches with the three impls since this might bring some light anywhay. |
Results for Multi Packed
|
I don't know the reason for the change so would like to know that first before changing it back.
Yeah that is what is done now using the |
As for |
I remember even if bit keccak is farest in super circuit, it is slowest in aggregation. Each column in target circuit brings huge overhead in aggregation. Previous we switched to multi_pack not because it is fast itself But it has least columns |
In the pr to switch to multi pack I and Han already posted many realistic and theory results talking sub circuit and aggregation circuit |
@CPerezz To clarify a little bit I was not in the discussion actually (that happened in Halo 2 Ecosystem discord), I only know it's said we need to be careful about the API change to avoid enable soundness breakage (which makes sense), not directly leading to soundness issue, so perhaps we might also experiment it before upstream if it's really painful to pass many const generics. |
Apologies @han0110 as I missunderstood you. Do you think is worth running the benches on the aggregation circuit with the keccak impls? Or shall we just use multi-packed directly? @Brechtpd @han0110 @lispc I know it's configurable as @Brechtpd told me. So I'm sure we can "shape" it to be longer (more rows) and less wide. Does this work for you all? |
9bb64a7
to
1abe872
Compare
Results for Bit Approach
Much bigger memory consuption while not giving any better performance. It's aprox 500s faster than I think considering the difference with the main time taken to execute all, we should definitely just take |
These were some results of the benches I ran on aws (c5.18xlarge). One extra caveat was that I used modified FFT implementation that is up to 3x faster than the one currently in halo2.
Another caveat is that the multi-packed was only run on a small circuit (2^16), and a higher circuit would also be significantly faster because of the reduced number of lookups required so the difference in practical scenario's will no doubt be smaller. Then of course the other concern is performance of the aggregation circuit. For now But I totally agree with how things are now |
Results for Packed Approach
This makes clear together with the comments that we should mainly just use |
1ddd0f9
to
53f0f54
Compare
53f0f54
to
4a6b55a
Compare
@Brechtpd could you also review this? You've been the biggest contributor in this area. So would be nice to count with your approval to do this!! I can't assign you directly. |
…ns#948) * feat: Integrate KeccakPackedMulti with Challenge-API * fix: Clippy lints * fix: Pay privacy-scaling-explorations#925 tech debt * remove: Keccak packed implementation * remove: Keccak-bit implementation * change: Export KeccakPackedConfig as KeccakConfig * remove: Benchmarks of discarted keccak impls
…ns#948) * add ec pairing invalid len > 65536 test * refine test * refine * fix: call_data_len can be 4 bytes long
In this PR we will run the three actual implementations of keccak with the ChallengeAPI already placed in.
With the results (which will be the SuperCircuit benchmarks where keccak is needed) we will chose only 1 implementation to keep and will remove the rest.
This should resolve: #941 and #943 as also the label to trigger these benchmarks was added.